Reinforcement Learning in Supervised Problem Domains
Abstract
Despite continuous advances in computing technology, today’s brute-force data processing approaches may not be enough to keep pace with the ever-growing amount of data witnessed over the last decades. In this thesis, we discuss novel methods and algorithms that direct attention to relevant details and analyse them in sequence, in order to overcome the processing bottleneck and keep up with this data explosion. In the first of three parts, a novel exploration technique for Policy Gradient Reinforcement Learning is presented, which replaces traditional additive random exploration with state-dependent exploration, exploring on a higher, more strategic level. We show that this new exploration method converges faster and finds better global solutions than random exploration. The second part introduces the concept of “data consumption” and discusses means to minimise it in supervised learning tasks by formulating classification as a sequential decision process, which makes it accessible to Reinforcement Learning methods. Depending on the previously selected features and the internal belief state of a classifier, the next feature is chosen by a sequential online feature selection procedure that learns which features are most informative at each time step. In experiments, this attentive hybrid learning system shows a significant reduction in the data required for correct classification. Finally, the third major contribution of this thesis is a novel sequence learning approach that learns an explicit contextual state while traversing a sequence. This context helps disambiguate the current input and reduces the need for a predictor capable of dealing with sequential data. We show the close relationship to concepts from theoretical computer science, in particular deterministic finite automata and regular languages, and demonstrate the capabilities of this hybrid algorithm experimentally.
All three parts share a tight integration of Reinforcement Learning and Supervised Learning, which not only offers an orthogonal view of this research but also establishes, for the first time, a general framework for such hybrid algorithms.
Similar resources
Semi-Supervised Apprenticeship Learning
In apprenticeship learning we aim to learn a good policy by observing the behavior of an expert or a set of experts. In particular, we consider the case where the expert acts so as to maximize an unknown reward function defined as a linear combination of a set of state features. In this paper, we consider the setting where we observe many sample trajectories (i.e., sequences of states) but only...
Identifying Intention Posts in Discussion Forums
This paper proposes to study the problem of identifying intention posts in online discussion forums. For example, in a discussion forum, a user wrote “I plan to buy a camera,” which indicates a buying intention. This intention can be easily exploited by advertisers. To the best of our knowledge, there is still no reported study of this problem. Our research found that this problem is particular...
Rollout Sampling Approximate Policy (arXiv:0805.2027v1 [cs.LG], 14 May 2008)
Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schemes without value functions which focus on policy representation using classifiers and address policy learning as a supervised learning problem. This paper proposes variants of an improved policy iteration scheme which...
Approximate Policy Iteration with Demonstration Data
We propose an algorithm to solve uncertain sequential decision-making problems that utilizes two different types of data sources. The first is the data available in the conventional reinforcement learning setup: an agent interacts with the environment and receives a sequence of state transition samples alongside the corresponding reward signal. The second data source, which differentiates the s...
Structural Abstraction Experiments in Reinforcement Learning
A challenge in applying reinforcement learning to large problems is how to manage the explosive increase in storage and time complexity. This is especially problematic in multi-agent systems, where the state space grows exponentially in the number of agents. Function approximation based on simple supervised learning is unlikely to scale to complex domains on its own, but structural abstraction ...
A Survey of Current Techniques for Reinforcement Learning
This survey considers response generating systems that improve their behaviour using reinforcement learning. The difference between unsupervised learning, supervised learning, and reinforcement learning is described. Two general problems concerning learning systems are presented: the credit assignment problem and the problem of perceptual aliasing. Notations and some general issues concerning re...
Publication date: 2016